ABSTRACT

In this paper, algorithms are described for obtaining 
the maximum likelihood estimates of the parameters in log-linear 
models. Modified versions of the iterative proportional fitting and 
Newton-Raphson algorithms are described that work on the minimal 
sufficient statistics rather than on the usual counts in the full 
contingency table. This is desirable if the contingency table becomes 
too large to store. Special attention is given to log-linear Item 
Response Theory (IRT) models that are used for the analysis of
educational and psychological test data. To calculate the necessary
expected sufficient statistics and other marginal sums of the table,
a method is described that avoids summing large numbers of elementary
cell frequencies by writing them out in terms of multiplicative model
parameters and applying the distributive law of multiplication over
summation. These algorithms are used in the computer program LOGIMO, 
and are illustrated with simulated data for 10,000 cases. Two tables, 
3 graphs, and a 34-item list of references are included. 
Abstract

In this paper algorithms are described for obtaining the 
maximum likelihood estimates of the parameters in log-linear 
models. Modified versions of the iterative proportional 
fitting and Newton-Raphson algorithms are described that work 
on the minimal sufficient statistics rather than on the usual 
counts in the full contingency table. This is desirable if 
the contingency table becomes too large to store. Special 
attention is given to log-linear IRT models that are used for
the analysis of educational and psychological test data. To 
calculate the necessary expected sufficient statistics and 
other marginal sums of the table, a method is described that 
avoids summing large numbers of elementary cell frequencies 
by writing them out in terms of multiplicative model 
parameters and applying the distributive law of 
multiplication over summation. These algorithms are used in 
the computer program LOGIMO. The modified algorithms are 
illustrated with simulated data. 
Computing Maximum Likelihood Estimates of
Loglinear Models from Marginal Sums 
with Special Attention to 
Loglinear Item Response Theory 

Purpose 

Log-linear models are used increasingly to analyze
psychological and educational tests (Cressie & Holland, 1983;
Duncan, 1984; Kelderman, 1984, 1989; Tjur, 1982). Current
computer programs such as GLIM (Baker & Nelder, 1978), ECTA
(Goodman & Fay, 1974) and SPSS LOGLINEAR (SPSS, 1988) for
analysis of log-linear models have limited utility when used
with models of the size and complexity required in some
applications to test and item analysis. The computer
program LOGIMO is especially designed for this situation. In
this paper the algorithms used in LOGIMO are described. The
algorithms are useful for the analysis of both ordinary
log-linear models and log-linear IRT models. For a discussion
of applications of log-linear IRT models the reader is
referred to Duncan (1984), Duncan and Stenbeck (1987) and
Kelderman (1984, 1989a, 1989b, 1991).

In this paper three log-linear models are used to 
describe the algorithms, one ordinary log-linear model and 
two log-linear IRT models. To keep the exposition simple, we
assume that each test has four items. Needless to say, the
results are also valid for larger numbers of items.
Let there be a sample of N subjects with responses i, j,
k and l on four variables. The i, j, k and l are realizations
of random variables with joint probability $p_{ijkl}$. Consider
the following examples of parametric models for $p_{ijkl}$.

Example 1 

The first model is an ordinary log-linear model (see
e.g. Agresti, 1984) describing interactions between
consecutive variables:

$$p_{ijkl} = a_{ij}\, b_{jk}\, c_{kl}, \tag{1}$$

i = 1, ..., I; j = 1, ..., J; k = 1, ..., K; l = 1, ..., L,

where $a_{ij}$, $b_{jk}$, $c_{kl}$ are parameters to be estimated. Even
though this simple multiplicative parameterization is not
identifiable, it is useful for illustrating the first
algorithm described in the next section. An identifiable
log-linear formulation of the model with main and interaction
effect terms will be presented later.

Example 2 

Let i, j, k, l = 0, 1 now be dichotomous item responses
and let m = i + j + k + l, the simple sum of item scores, be
a new variable. Several authors (e.g. Cressie & Holland,
1983; Kelderman, 1984) have shown that the model

$$p_{ijklm} = a_i\, b_j\, c_k\, d_l\, e_m \tag{2}$$

is equivalent to the dichotomous Rasch (1960/1980) model.
This is readily seen by conditioning on the sum score, which
yields the familiar formulation of the conditional Rasch
model (Rasch, 1980, p. 177):

$$p_{ijkl|m} = \frac{a_i\, b_j\, c_k\, d_l}{\sum_{i+j+k+l=m} a_i\, b_j\, c_k\, d_l}.$$

The parameters in (2) are multiplicative main effect
parameters describing the effect of the variables. The usual
additive Rasch item difficulty parameters can be obtained
from them as $(\log a_0 - \log a_1)$, $(\log b_0 - \log b_1)$, etc.
They are unique up to an additive constant. Let us note that
the variable m in $p_{ijklm}$ is redundant because it depends
completely on i, j, k, and l. Now consider a two-dimensional
log-linear IRT model.
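
The conditioning argument can be checked numerically. The following Python sketch uses made-up parameter values and hypothetical function names (none of them come from LOGIMO); it only illustrates that the score parameters $e_m$ cancel when Model 2 is conditioned on the sum score.

```python
from itertools import product

# Made-up multiplicative parameters for four dichotomous items
# (position 0 = response 0, position 1 = response 1).
a, b, c, d = (1.0, 0.5), (1.0, 2.0), (1.0, 0.8), (1.0, 1.5)
# Made-up score parameters e_m for m = 0, ..., 4.
e = (0.2, 0.1, 0.3, 0.25, 0.15)

PATTERNS = list(product((0, 1), repeat=4))

def joint(i, j, k, l):
    """Unnormalised p_ijklm of Model 2, with m = i + j + k + l."""
    return a[i] * b[j] * c[k] * d[l] * e[i + j + k + l]

def conditional(i, j, k, l):
    """p_ijkl|m obtained by dividing by the total over patterns with the same m."""
    m = i + j + k + l
    total = sum(joint(*x) for x in PATTERNS if sum(x) == m)
    return joint(i, j, k, l) / total

def conditional_rasch(i, j, k, l):
    """The same quantity computed from the item parameters alone (no e_m)."""
    m = i + j + k + l
    numerator = a[i] * b[j] * c[k] * d[l]
    denominator = sum(a[x[0]] * b[x[1]] * c[x[2]] * d[x[3]]
                      for x in PATTERNS if sum(x) == m)
    return numerator / denominator
```

Both functions return the same value for every response pattern, whatever values are chosen for $e_m$.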

Example 3 

The most complicated model considered here contains two
variables that depend on item responses. To define these
variables, two weights are assigned to each response. These
weights or category coefficients are positive integers
denoted by $v_1(i)$ and $w_1(i)$, $v_2(j)$ and $w_2(j)$,
$v_3(k)$ and $w_3(k)$, $v_4(l)$ and $w_4(l)$ for items i, j, k,
and l respectively. New variables may now be defined as the
simple sums of weights

$$\begin{aligned}
m &= v_1(i) + v_2(j) + v_3(k) + v_4(l), \\
t &= w_1(i) + w_2(j) + w_3(k) + w_4(l),
\end{aligned} \tag{3}$$

for i = 1, ..., I; j = 1, ..., J; k = 1, ..., K;
l = 1, ..., L. A two-dimensional log-linear IRT model can now
be written as

$$p_{ijklmt} = a_i\, b_j\, c_k\, d_l\, e_{mt}. \tag{4}$$

Kelderman (1989) showed that, for suitable choice of category
coefficients, (4) defines a class of IRT models that includes
the partial credit model (Masters, 1982), the
multidimensional Rasch (Andersen, 1973; Rasch, 1961) model,
and other interesting IRT models. It is easy to see that
Model 4 can be expanded to include more items, more
weight-sum variables and/or interaction terms as in Example 1.

Problems are likely to arise with the usual algorithms
for maximum likelihood estimation of parameters in log-linear
models if the number of items or weight-sum variables is
large. Most of the currently available algorithms require the
storage of the tables of observed and expected counts
($\{f_{ijkl}\}$ and $\{F_{ijkl}\}$ $(= N p_{ijkl})$, respectively).
These tables can become extremely large if the number of
items is not small. For example, if there are twelve
four-response items, each table will consist of about
17 million ($4^{12}$) cells.
The algorithms described below avoid this problem by
computing the parameters directly from certain marginal sums
of the contingency table. The next section describes two such
algorithms: a modified version of the iterative proportional
fitting algorithm, and a version of the Newton-Raphson
algorithm. Furthermore, an efficient method to calculate the 
expected marginal sums is described at the end of the next 
section. In the applications section, the computational 
efficiency of this method is assessed, and the modified IPF 
algorithm is applied to a set of simulated data. 



Description 

If it is assumed that the subjects respond independently of
one another, the frequencies $\{f_{ijkl}\}$ have a multinomial
distribution with index N and probabilities $\{p_{ijkl}\}$. The
likelihood of the models for sample data is

$$L = h \prod_i \prod_j \prod_k \prod_l p_{ijkl}^{\,f_{ijkl}},$$

where h is a function of the data only. The variables m and t
are omitted in the above expression. Taking the derivatives
of the log likelihood with respect to the parameters and
setting them equal to zero will yield the maximum likelihood
equations (see Haberman, 1979, p. 448). For the model in
Example 1 the maximum likelihood equations become



$$\begin{aligned}
f_{ij++} - F_{ij++} &= 0, \qquad i = 1, \ldots, I;\ j = 1, \ldots, J, \\
f_{+jk+} - F_{+jk+} &= 0, \qquad j = 1, \ldots, J;\ k = 1, \ldots, K, \\
f_{++kl} - F_{++kl} &= 0, \qquad k = 1, \ldots, K;\ l = 1, \ldots, L,
\end{aligned} \tag{5}$$

where a plus sign replacing an index denotes summation over
that index (e.g. $F_{ij++} = \sum_k \sum_l F_{ijkl}$). The marginal
sums $\{f_{ij++}\}$, $\{f_{+jk+}\}$, and $\{f_{++kl}\}$ are minimal
sufficient statistics for the parameters $\{a_{ij}\}$, $\{b_{jk}\}$,
and $\{c_{kl}\}$ respectively. Generally, in log-linear model
analysis, the sufficient statistics associated with parameters
are the marginal sums with the same indices as the
corresponding parameters. Furthermore, the likelihood
equations are obtained by setting the observed sufficient
statistics equal to the corresponding expected values under
the model. Thus, for Model 2, the likelihood equations are
obtained by setting the marginal sums $\{f_{i++++}\}$,
$\{f_{+j+++}\}$, $\{f_{++k++}\}$, $\{f_{+++l+}\}$ and $\{f_{++++m}\}$
equal to the corresponding expected values $\{F_{i++++}\}$,
$\{F_{+j+++}\}$, $\{F_{++k++}\}$, $\{F_{+++l+}\}$ and $\{F_{++++m}\}$.

Solving the Equations (5) for the parameters yields the
maximum likelihood estimates of the parameters. These
equations cannot be solved directly, but numerical
algorithms are available for their solution (e.g. Baker &
Nelder, 1978; Goodman & Fay, 1974).

A Modified Iterative Proportional Fitting Algorithm

In iterative proportional fitting (IPF; Deming &
Stephan, 1940), the expected cell counts $\{F_{ijkl}\}$ are
proportionally adjusted to fit the set of marginal sums
obtained from the sample. In this section we describe a
modified IPF algorithm to adjust parameter estimates rather
than expected cell frequencies. This modification alleviates
both storage requirements and computational complexity
because test-data models usually have far fewer parameters
than expected frequencies.

Let us consider regular IPF. Denoting the expected
counts before the adjustment as $\{\hat F^{(old)}_{ijkl}\}$ and after
adjustment as $\{\hat F^{(new)}_{ijkl}\}$, start the computational
procedure by setting all $\hat F^{(old)}_{ijkl} = 1$. In IPF, the
maximum likelihood estimates $\{\hat F_{ijkl}\}$ under Model 1 are
obtained by repeated application of the adjustments

$$\begin{aligned}
\hat F^{(new)}_{ijkl} &= \hat F^{(old)}_{ijkl}\,\bigl(f_{ij++} / \hat F^{(old)}_{ij++}\bigr), \\
\hat F^{(new)}_{ijkl} &= \hat F^{(old)}_{ijkl}\,\bigl(f_{+jk+} / \hat F^{(old)}_{+jk+}\bigr), \\
\hat F^{(new)}_{ijkl} &= \hat F^{(old)}_{ijkl}\,\bigl(f_{++kl} / \hat F^{(old)}_{++kl}\bigr),
\end{aligned}$$

each for i = 1, ..., I; j = 1, ..., J; k = 1, ..., K;
l = 1, ..., L, until convergence is achieved. The algorithm
will always converge to a solution satisfying Equations 5.
The application of IPF to other models, such as those given
in Examples 2 and 3, is straightforward.
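
For reference, the three adjustments of regular IPF can be sketched in a few lines of Python. This is only an illustration on a made-up 2 x 2 x 2 x 2 table with strictly positive counts; the function names and the data are assumptions, not part of LOGIMO. Note that the sketch stores the full table, which is exactly what the modified algorithm is designed to avoid.

```python
from itertools import product

def margin(table, x, y):
    """Sum a four-way table (dict keyed by (i, j, k, l)) down to axes x and y."""
    out = {}
    for cell, value in table.items():
        key = (cell[x], cell[y])
        out[key] = out.get(key, 0.0) + value
    return out

def ipf_full_table(f, cycles=20):
    """Ordinary IPF for Model 1, adjusting the expected counts themselves."""
    F = {cell: 1.0 for cell in f}              # start with all expected counts at 1
    for _ in range(cycles):
        for x, y in [(0, 1), (1, 2), (2, 3)]:  # margins (i,j), (j,k), (k,l)
            observed, expected = margin(f, x, y), margin(F, x, y)
            for cell in F:
                key = (cell[x], cell[y])
                F[cell] *= observed[key] / expected[key]
    return F

# Made-up table of strictly positive observed counts (illustration only).
cells = list(product(range(2), repeat=4))
f_obs = {cell: float(5 + (3 * n) % 11) for n, cell in enumerate(cells)}
F_hat = ipf_full_table(f_obs)
```

After the run, the three fitted margins of `F_hat` agree with those of `f_obs`, which is the content of Equations 5.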

To adjust parameter estimates rather than expected cell
frequencies, let us first express $\hat F_{ijkl}$ in terms of
parameters. For the first update, this becomes

$$N a^{(new)}_{ij} b^{(new)}_{jk} c^{(new)}_{kl} = N a^{(old)}_{ij} b^{(old)}_{jk} c^{(old)}_{kl}\,\bigl(f_{ij++} / \hat F^{(old)}_{ij++}\bigr).$$

Because the same adjustment $(f_{ij++} / \hat F^{(old)}_{ij++})$ is made
for all values of k and l, it suffices to change the parameter
$a_{ij}$ only. The remaining parameters $b_{jk}$ and $c_{kl}$ can be
treated as constants so that
$b^{(new)}_{jk} c^{(new)}_{kl} = b^{(old)}_{jk} c^{(old)}_{kl}$. Therefore, we
have

$$a^{(new)}_{ij} = a^{(old)}_{ij}\,\bigl(f_{ij++} / \hat F^{(old)}_{ij++}\bigr), \qquad i = 1, \ldots, I;\ j = 1, \ldots, J.$$

Similarly for the other updates, we have

$$b^{(new)}_{jk} = b^{(old)}_{jk}\,\bigl(f_{+jk+} / \hat F^{(old)}_{+jk+}\bigr), \qquad j = 1, \ldots, J;\ k = 1, \ldots, K,$$

and

$$c^{(new)}_{kl} = c^{(old)}_{kl}\,\bigl(f_{++kl} / \hat F^{(old)}_{++kl}\bigr), \qquad k = 1, \ldots, K;\ l = 1, \ldots, L.$$

Within the modified IPF algorithm, only IJ + JK + KL
parameters have to be adjusted in one cycle. Compared to the
3(IJKL) cell frequencies in ordinary IPF, there is a
considerable reduction in computational complexity with the
modified IPF. We will look at this reduction more
closely in the section on applications.
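
The parameter updates can be sketched as follows in Python. This is a minimal illustration, not the LOGIMO implementation: the function name and the data are made up, the sample size N is absorbed into the (indeterminate) parameters so that $F_{ijkl} = a_{ij} b_{jk} c_{kl}$, and the expected margins are obtained by summing parameters rather than cells.

```python
from itertools import product

def modified_ipf(f_ij, f_jk, f_kl, cycles=10):
    """Adjust a_ij, b_jk, c_kl until the expected margins match the observed ones.

    f_ij, f_jk, f_kl hold the observed margins f_ij++, f_+jk+, f_++kl
    as nested lists; they must be mutually consistent and strictly positive.
    """
    I, J, K, L = len(f_ij), len(f_jk), len(f_kl), len(f_kl[0])
    a = [[1.0] * J for _ in range(I)]
    b = [[1.0] * K for _ in range(J)]
    c = [[1.0] * L for _ in range(K)]
    for _ in range(cycles):
        # Update a_ij, using F_ij++ = a_ij * sum_k b_jk * (sum_l c_kl).
        c_sum = [sum(row) for row in c]
        right = [sum(b[j][k] * c_sum[k] for k in range(K)) for j in range(J)]
        for i, j in product(range(I), range(J)):
            a[i][j] *= f_ij[i][j] / (a[i][j] * right[j])
        # Update b_jk, using F_+jk+ = (sum_i a_ij) * b_jk * (sum_l c_kl).
        a_sum = [sum(a[i][j] for i in range(I)) for j in range(J)]
        for j, k in product(range(J), range(K)):
            b[j][k] *= f_jk[j][k] / (a_sum[j] * b[j][k] * c_sum[k])
        # Update c_kl, using F_++kl = (sum_j (sum_i a_ij) * b_jk) * c_kl.
        left = [sum(a_sum[j] * b[j][k] for j in range(J)) for k in range(K)]
        for k, l in product(range(K), range(L)):
            c[k][l] *= f_kl[k][l] / (left[k] * c[k][l])
    return a, b, c

# Made-up 2 x 2 x 2 x 2 table, used here only to derive consistent margins.
R = range(2)
counts = {cell: float(5 + (3 * n) % 11)
          for n, cell in enumerate(product(R, repeat=4))}
f_ij = [[sum(counts[i, j, k, l] for k in R for l in R) for j in R] for i in R]
f_jk = [[sum(counts[i, j, k, l] for i in R for l in R) for k in R] for j in R]
f_kl = [[sum(counts[i, j, k, l] for i in R for j in R) for l in R] for k in R]
a_hat, b_hat, c_hat = modified_ipf(f_ij, f_jk, f_kl)
```

Only the three arrays of parameters and the three observed margins are stored; the 16-cell table appears above solely to produce example margins.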

The IPF algorithm works with indeterminate parameters. A
unique solution of the log-linear version of Model 1 with
main and interaction effect parameters can be obtained by the
following reparameterization.

$$\begin{aligned}
\mu &= \log a_{IJ} + \log b_{JK} + \log c_{KL}, \\
\alpha_i &= \log a_{iJ} - \log a_{IJ}, \\
\beta_j &= \log a_{Ij} - \log a_{IJ} + \log b_{jK} - \log b_{JK}, \\
\gamma_k &= \log b_{Jk} - \log b_{JK} + \log c_{kL} - \log c_{KL}, \\
\delta_l &= \log c_{Kl} - \log c_{KL}, \\
(\alpha\beta)_{ij} &= \log a_{ij} - \log a_{Ij} - \log a_{iJ} + \log a_{IJ}, \\
(\beta\gamma)_{jk} &= \log b_{jk} - \log b_{Jk} - \log b_{jK} + \log b_{JK}, \\
(\gamma\delta)_{kl} &= \log c_{kl} - \log c_{Kl} - \log c_{kL} + \log c_{KL},
\end{aligned}$$

where $\alpha_i$, $\beta_j$, $\gamma_k$ and $\delta_l$ are main effect
parameters and $(\alpha\beta)_{ij}$, $(\beta\gamma)_{jk}$ and
$(\gamma\delta)_{kl}$ interaction effect parameters.

It is easy to verify that the model (i.e. $\{p_{ijkl}\}$) would
remain invariant under this reparameterization. That is:

$$\log a_{ij} b_{jk} c_{kl} = \mu + \alpha_i + \beta_j + \gamma_k + \delta_l + (\alpha\beta)_{ij} + (\beta\gamma)_{jk} + (\gamma\delta)_{kl} \tag{6}$$

and that the constraints

$$\alpha_I = \beta_J = \gamma_K = \delta_L = (\alpha\beta)_{Ij} = (\alpha\beta)_{iJ} = (\beta\gamma)_{Jk} = (\beta\gamma)_{jK} = (\gamma\delta)_{Kl} = (\gamma\delta)_{kL} = 0 \tag{7}$$

are satisfied.

This parameterization contrasts the effect of each
category with the last. Bock (1975, p. 239) refers to this as
the 'simple contrast'. Other parameterizations such as
deviation contrasts, where the effect of each category is
contrasted with the mean effect, can be obtained by similar
transformations.
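
The simple-contrast reparameterization can be written out directly. The Python sketch below uses made-up positive parameter values and a hypothetical function name; it takes the last category of each variable as the reference and can be used to check numerically that the main and interaction effects add up to $\log a_{ij} b_{jk} c_{kl}$ and that the reference-category effects vanish.

```python
from math import log

def simple_contrast(a, b, c):
    """Main and interaction effects from multiplicative parameters a_ij, b_jk, c_kl."""
    I, J, K, L = len(a), len(a[0]), len(b[0]), len(c[0])
    la = [[log(x) for x in row] for row in a]
    lb = [[log(x) for x in row] for row in b]
    lc = [[log(x) for x in row] for row in c]
    ri, rj, rk, rl = I - 1, J - 1, K - 1, L - 1   # reference (last) categories
    mu = la[ri][rj] + lb[rj][rk] + lc[rk][rl]
    alpha = [la[i][rj] - la[ri][rj] for i in range(I)]
    beta = [la[ri][j] - la[ri][rj] + lb[j][rk] - lb[rj][rk] for j in range(J)]
    gamma = [lb[rj][k] - lb[rj][rk] + lc[k][rl] - lc[rk][rl] for k in range(K)]
    delta = [lc[rk][l] - lc[rk][rl] for l in range(L)]
    ab = [[la[i][j] - la[ri][j] - la[i][rj] + la[ri][rj] for j in range(J)]
          for i in range(I)]
    bg = [[lb[j][k] - lb[rj][k] - lb[j][rk] + lb[rj][rk] for k in range(K)]
          for j in range(J)]
    gd = [[lc[k][l] - lc[rk][l] - lc[k][rl] + lc[rk][rl] for l in range(L)]
          for k in range(K)]
    return mu, alpha, beta, gamma, delta, ab, bg, gd

# Made-up positive parameters for a 2 x 2 x 2 x 2 table (illustration only).
a = [[1.0, 2.0], [0.5, 1.5]]
b = [[0.7, 1.2], [2.0, 0.9]]
c = [[1.1, 0.6], [0.8, 1.3]]
mu, alpha, beta, gamma, delta, ab, bg, gd = simple_contrast(a, b, c)
```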

A Newton-Raphson Algorithm 

The well-known Newton-Raphson algorithm is based on a
second order Taylor expansion of the log-likelihood function
(Andersen, 1980, p. 47; Adby & Dempster, 1974, p. 65). The
algorithm iteratively computes the log-linear parameters
using the gradient and the Hessian matrix, which can be
written as functions of the marginal sums. Before discussing
the Newton-Raphson (N-R) update, let us first introduce the
matrix formulation of the log-linear formulation given in (6)
for Model 1:

$$\log p_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k + \delta_l + (\alpha\beta)_{ij} + (\beta\gamma)_{jk} + (\gamma\delta)_{kl}.$$

Without loss of generality, let us assume that I = J = K = L
= 2. Unlike IPF, the N-R algorithm requires the parameters to
be identified. Therefore we impose the constraints given in
(7). Let $p = (p_{1111}, p_{2111}, \ldots, p_{2222})'$ be the vector of
cell probabilities, and let
$\xi = (\mu, \alpha_1, \beta_1, \gamma_1, \delta_1, (\alpha\beta)_{11}, (\beta\gamma)_{11}, (\gamma\delta)_{11})'$
be the vector of parameters to be estimated.
The matrix version of the model can be written as

$$\log p = D\xi,$$

where D is the design matrix with ones and zeros in the
appropriate places and log means the elementwise logarithm
operator. Letting $f = (f_{1111}, f_{2111}, \ldots, f_{2222})'$ and
$A = \mathrm{diag}(p)$, the gradient vector and the Hessian matrix
can be expressed as

$$g = \frac{\partial \log L}{\partial \xi} = D'f - D'p\,N$$

and

$$H = \frac{\partial^2 \log L}{\partial \xi\, \partial \xi'} = -N\bigl[D'AD - (D'p)(D'p)'\bigr]$$

respectively.

These can also be expressed in terms of marginal sums
since

$$D'f = (f_{1+++},\ f_{+1++},\ f_{++1+},\ f_{+++1},\ f_{11++},\ f_{+11+},\ f_{++11})',$$

$$D'p = (p_{1+++},\ p_{+1++},\ p_{++1+},\ p_{+++1},\ p_{11++},\ p_{+11+},\ p_{++11})',$$

and

$$D'AD = \begin{pmatrix}
p_{1+++} & p_{11++} & p_{1+1+} & p_{1++1} & p_{11++} & p_{111+} & p_{1+11} \\
p_{11++} & p_{+1++} & p_{+11+} & p_{+1+1} & p_{11++} & p_{+11+} & p_{+111} \\
p_{1+1+} & p_{+11+} & p_{++1+} & p_{++11} & p_{111+} & p_{+11+} & p_{++11} \\
p_{1++1} & p_{+1+1} & p_{++11} & p_{+++1} & p_{11+1} & p_{+111} & p_{++11} \\
p_{11++} & p_{11++} & p_{111+} & p_{11+1} & p_{11++} & p_{111+} & p_{1111} \\
p_{111+} & p_{+11+} & p_{+11+} & p_{+111} & p_{111+} & p_{+11+} & p_{+111} \\
p_{1+11} & p_{+111} & p_{++11} & p_{++11} & p_{1111} & p_{+111} & p_{++11}
\end{pmatrix}. \tag{8}$$



The N-R algorithm repeatedly adjusts the parameters. Let
$\xi^{(old)}$ and $\xi^{(new)}$ be the parameter vectors before and
after adjustment and let $g^{(old)}$, $g^{(new)}$ and $H^{(old)}$,
$H^{(new)}$ be the gradient and Hessian computed from them. The
maximum likelihood estimates of $\xi$ are obtained by repeated
application of

$$\xi^{(new)} = \xi^{(old)} + \Delta,$$

where $\Delta$ is the solution of the linear system

$$H^{(old)} \Delta = -g^{(old)}.$$

Usually the update $\Delta$ is computed by pre-multiplication of
the system by the inverse of $H^{(old)}$, but it is more efficient
to solve the system directly for $\Delta$ (Dongarra et al., 1979;
Holland & Thayer, 1987). Gill, Murray and Wright (1991)
describe fast methods for solving systems of linear
equations. The Newton-Raphson algorithm converges much more
rapidly to the maximum likelihood solution than the IPF
algorithm but requires starting values that are close to the
final solution. Also, it requires the marginal sums given in
(8), which are not necessary for the modified IPF algorithm.
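
A single N-R run for the 2 x 2 x 2 x 2 case can be sketched in plain Python. Several choices below are assumptions made for the illustration and are not taken from the paper or from LOGIMO: the constant $\mu$ is handled by normalizing the probabilities rather than as a column of D (so D has the seven columns that correspond to the entries of $D'p$), the marginal sums are obtained by looping over the 16 cells instead of by an efficient method, the counts are made up, and the starting values are all zero.

```python
from itertools import product
from math import exp

CELLS = list(product((1, 2), repeat=4))

def design_row(i, j, k, l):
    """Indicators for alpha_1, beta_1, gamma_1, delta_1, (ab)_11, (bg)_11, (gd)_11."""
    x = [int(i == 1), int(j == 1), int(k == 1), int(l == 1)]
    return x + [x[0] * x[1], x[1] * x[2], x[2] * x[3]]

D = [design_row(*cell) for cell in CELLS]
P = len(D[0])

def probabilities(xi):
    """Cell probabilities proportional to exp(D xi); normalization plays the role of mu."""
    w = [exp(sum(d * x for d, x in zip(row, xi))) for row in D]
    total = sum(w)
    return [v / total for v in w]

def solve(A, y):
    """Gaussian elimination with partial pivoting for a small dense system A x = y."""
    n = len(y)
    M = [row[:] + [y[r]] for r, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for s in range(col, n + 1):
                M[r][s] -= factor * M[col][s]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][s] * x[s] for s in range(r + 1, n))) / M[r][r]
    return x

def newton_raphson(f, steps=25):
    """Repeat xi_new = xi_old + Delta, where H Delta = -g."""
    N = sum(f)
    cells = range(len(D))
    xi = [0.0] * P
    Dtf = [sum(D[c][r] * f[c] for c in cells) for r in range(P)]
    for _ in range(steps):
        p = probabilities(xi)
        Dtp = [sum(D[c][r] * p[c] for c in cells) for r in range(P)]
        g = [Dtf[r] - N * Dtp[r] for r in range(P)]
        H = [[-N * (sum(D[c][r] * p[c] * D[c][s] for c in cells) - Dtp[r] * Dtp[s])
              for s in range(P)] for r in range(P)]
        delta = solve(H, [-v for v in g])
        xi = [x + d for x, d in zip(xi, delta)]
    return xi

# Made-up observed counts, listed in the order of CELLS (illustration only).
f_example = [30, 25, 40, 35, 22, 28, 33, 45, 50, 27, 31, 38, 26, 42, 36, 29]
xi_hat = newton_raphson(f_example)
```

At the solution the gradient vanishes, so the fitted marginal sums $N D'p$ reproduce the observed sums $D'f$.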

The most important feature of the above modifications of 
the IPF and N-R algorithms is that in neither case is it 
necessary to set up the full contingency table. Marginal sums 
alone are sufficient. Although this reduces storage
requirements, it does not relieve us of the computational
burden of summing over the cells of the full table, which is 
probably the reason why the above N-R procedure is never used 
in existing programs for log-linear analysis. A novel element 
in the application of the N-R algorithm and modified IPF, is 
that the marginal sums are computed in an efficient way 
described in the next section. 

Efficient Computation of Marginal Sums

The obvious way to compute $\{F_{ij++}\}$ is to sum over the
cells

$$F_{ij++} = N \sum_k \sum_l p_{ijkl} = N \sum_k \sum_l a_{ij}\, b_{jk}\, c_{kl}, \tag{9}$$

i = 1, ..., I; j = 1, ..., J, where the last term is used to
avoid storage of the full table.

Suppose that I = J = K = L = 10; then (9) involves
2(IJKL) + 1 = 20,001 multiplications and IJ(KL - 1) = 9900
summations. This number of computations can be reduced by
rewriting (9), using the distributive law of multiplication
over summation, as

$$F_{ij++} = N a_{ij} \sum_k b_{jk} \sum_l c_{kl},$$

i = 1, ..., I; j = 1, ..., J. This requires only 1 + IJ + K =
111 multiplications and J(K-1) + K(L-1) = 180 summations.
This is obviously a considerable reduction in the number of
computations needed.
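
The two ways of computing $\{F_{ij++}\}$ can be put side by side in Python. The function names and parameter values below are made up for illustration; the point is only that the factored form gives the same margins while touching each $c_{kl}$ and $b_{jk}$ once.

```python
def margin_all_cells(a, b, c, N):
    """F_ij++ as in (9): sum N * a_ij * b_jk * c_kl over every k and l."""
    J, K, L = len(b), len(c), len(c[0])
    return [[sum(N * a[i][j] * b[j][k] * c[k][l]
                 for k in range(K) for l in range(L))
             for j in range(J)] for i in range(len(a))]

def margin_distributive(a, b, c, N):
    """F_ij++ = N * a_ij * sum_k b_jk * (sum_l c_kl), summing parameters first."""
    c_sum = [sum(row) for row in c]                      # sum_l c_kl, one value per k
    bc = [sum(b_jk * s for b_jk, s in zip(row, c_sum))   # sum_k b_jk * c_sum[k], per j
          for row in b]
    return [[N * a_ij * bc[j] for j, a_ij in enumerate(row)] for row in a]

# Made-up parameters (illustration only).
a = [[1.0, 2.0], [0.5, 1.5]]
b = [[0.7, 1.2], [2.0, 0.9]]
c = [[1.1, 0.6], [0.8, 1.3]]
slow = margin_all_cells(a, b, c, N=100)
fast = margin_distributive(a, b, c, N=100)
```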

We will refer to this method of computing the expected
marginal sums as the marginalization-by-variable (MBV)
method, because summations for one variable (at a time) are
done only over parameters that depend on that variable.
Multiplication with parameters that do not depend on that
variable is postponed until after the summation.

The MBV method becomes more complicated if the model
contains weight-sum variables, because they are dependent on
item responses (e.g. Example 3). In that case, the values
that a summation variable in the MBV method can take may
depend on the value of other summation variables. For
example, the computation of $F_{++++m}$ in Model 2 can be
written as

$$F_{++++m} = N \sum_{\substack{i,\,j,\,k,\,l \\ i+j+k+l=m}} a_i\, b_j\, c_k\, d_l\, e_m.$$

The summations over i, j, k and l may only be performed for
those patterns for which i + j + k + l = m. To see what this
means for each separate summation let us rewrite i + j + k +
l = m into the equivalent form

$$\begin{aligned}
m_1 &= i + j, \\
m_2 &= m_1 + k, \\
m &= m_2 + l,
\end{aligned}$$

where $m_1$ and $m_2$ are partial sum scores.

Let $\sum_{x,y;\ x+y=z}$ mean the summation over the values of x
and y for which x + y = z; the MBV method for computing
$F_{++++m}$ then becomes

$$F_{++++m} = N e_m \sum_{\substack{m_2,\,l \\ m_2+l=m}} d_l \Bigl( \sum_{\substack{m_1,\,k \\ m_1+k=m_2}} c_k \Bigl( \sum_{\substack{i,\,j \\ i+j=m_1}} b_j\, a_i \Bigr) \Bigr). \tag{10}$$

In the above equation, $a_i$ and $b_j$ are first multiplied for all
i = 0, 1 and j = 0, 1. The products for which i + j = $m_1$ are
summed, which gives a separate sum for each $m_1$ (= 0, 1, 2).
Each sum is then multiplied with each of the $c_k$ (k = 0, 1)
parameters. Again, these products are summed if $m_1$ + k = $m_2$.
This yields a sum for each $m_2$ (= 0, 1, 2, 3). Finally, this
process of multiplication and summation is repeated one more
time to obtain $F_{++++m}$. In this way, the marginal sums are
computed efficiently while, at the same time, avoiding
summation over logically impossible combinations of variable
values.
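
The item-by-item accumulation over partial sum scores can be sketched in Python as follows. The function name and all parameter values are made up for illustration; starting the recursion from a single partial sum equal to 1 (rather than from the products of the first two items) is a small convenience of this sketch that leaves the result unchanged.

```python
from itertools import product

def sum_score_margins(item_params, e, N):
    """F_++++m for every m, accumulating one item at a time.

    item_params: one pair (parameter for response 0, parameter for response 1)
    per item; e: list with e[m] for m = 0, ..., number of items.
    """
    partial = [1.0]   # partial[m]: sum of products over patterns so far with score m
    for x0, x1 in item_params:
        nxt = [0.0] * (len(partial) + 1)
        for m, value in enumerate(partial):
            nxt[m] += value * x0        # response 0 leaves the partial score at m
            nxt[m + 1] += value * x1    # response 1 raises it to m + 1
        partial = nxt
    return [N * e[m] * partial[m] for m in range(len(partial))]

# Made-up parameters for four dichotomous items (illustration only).
items = [(1.0, 0.5), (1.0, 2.0), (1.0, 0.8), (1.0, 1.5)]
e = [0.2, 0.1, 0.3, 0.25, 0.15]
F_m = sum_score_margins(items, e, N=100)

# Direct summation over all 2**4 response patterns, for comparison.
F_m_direct = [0.0] * 5
for pattern in product((0, 1), repeat=4):
    term = 100 * e[sum(pattern)]
    for item, x in zip(items, pattern):
        term *= item[x]
    F_m_direct[sum(pattern)] += term
```

Each pass over an item combines it with at most n + 1 partial sums, so no response pattern is ever enumerated.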

In a similar manner, the marginal sums for the model in
Example 3 can also be computed. First, rewrite the
weight-sums given in (3) as

$$\begin{aligned}
m_1 &= v_1(i) + v_2(j), & m_2 &= m_1 + v_3(k), & m &= m_2 + v_4(l), \\
t_1 &= w_1(i) + w_2(j), & t_2 &= t_1 + w_3(k), & t &= t_2 + w_4(l).
\end{aligned} \tag{11}$$

Under these constraints, the marginal sum $F_{++++mt}$ can be
computed as

$$F_{++++mt} = N e_{mt} \sum_{l,\,m_2,\,t_2} d_l \Bigl( \sum_{k,\,m_1,\,t_1} c_k \Bigl( \sum_{i,\,j} b_j\, a_i \Bigr) \Bigr).$$

Again each summation can be performed separately if the
constraints in (11) are respected. Obviously, the same method
can be applied to calculate the other expected marginal sums
such as $\{F_{+j++++}\}$, etc. Consequently, the MBV
method can supply all marginal sums needed in the modified
IPF or N-R algorithm.
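
With two weight-sum variables the partial sums are indexed by pairs of partial weight-sums. The Python sketch below is an illustration under stated assumptions: the function name, the three items with three response categories each, the weights and all parameter values are made up, and the partial sums are kept in a dictionary keyed by the pair (partial m, partial t).

```python
from itertools import product

def weight_sum_margins(params, v, w, e, N):
    """F_+...+mt for every reachable (m, t), accumulating one item at a time.

    params[q][x]: main effect parameter of item q for response x;
    v[q][x], w[q][x]: integer category weights of item q for response x;
    e: dict mapping (m, t) to e_mt for every reachable pair.
    """
    partial = {(0, 0): 1.0}
    for q, item in enumerate(params):
        nxt = {}
        for (m, t), value in partial.items():
            for x, par in enumerate(item):
                key = (m + v[q][x], t + w[q][x])   # constraints as in (11)
                nxt[key] = nxt.get(key, 0.0) + value * par
        partial = nxt
    return {key: N * e[key] * value for key, value in partial.items()}

# Made-up example: three items with three response categories each.
params = [[1.0, 0.6, 0.3], [1.0, 1.4, 0.9], [1.0, 0.8, 2.0]]
v = [[1, 2, 3]] * 3
w = [[1, 1, 2]] * 3
e = {(m, t): 1.0 + 0.1 * m + 0.01 * t for m in range(10) for t in range(10)}
F_mt = weight_sum_margins(params, v, w, e, N=100)

# Direct summation over all 3**3 response patterns, for comparison.
F_mt_direct = {}
for pattern in product(range(3), repeat=3):
    m = sum(v[q][x] for q, x in enumerate(pattern))
    t = sum(w[q][x] for q, x in enumerate(pattern))
    term = 100 * e[m, t]
    for q, x in enumerate(pattern):
        term *= params[q][x]
    F_mt_direct[m, t] = F_mt_direct.get((m, t), 0.0) + term
```

Only combinations of partial weight-sums that can actually occur are ever created, which mirrors the remark about avoiding logically impossible combinations.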

The modified IPF algorithm, using the MBV method to
compute expected marginal sums, is implemented in the
computer program called LOGIMO (LOGlinear IRT MOdeling;
Kelderman & Steen, 1988). LOGIMO is a Pascal program that
estimates log-linear models with main and interaction effect
parameters of item response, background variables and one or
more weight-sum variables as shown in Example 3. The weights
are integer valued and must be specified by the user. In the
next section we present the application of the modified IPF
and N-R algorithm.



Application 



The complexity of computing the parameters of log-linear
models is substantially reduced by using modified IPF and N-R
algorithms based on marginal sums that can be computed
efficiently by the MBV method. In this section we will
examine the computational complexity as a function of the
number of variables in the model. We will first look at the
increase in computational complexity with the MBV algorithm
and then at the full algorithm.

In this application, we restrict our attention to the
IPF algorithm and to the simplest model with sum scores as
given in (2). This model is chosen because the number of MBV
computations is tractable and because it is equivalent to the
dichotomous Rasch model. Consequently the parameter estimates
can be compared to those of an existing algorithm for
computing Rasch parameters in order to verify the correctness
of the algorithm.
Insert Table 1 here



In Table 1, the numbers of summations and
multiplications in the computation of $F_{++++m}$ of the simple
sum-score model (2) are given for five to 20 items. It can be
seen that for the MBV algorithm these numbers remain within
reasonable limits, whereas for the case of summing over all
cells (9) these numbers increase very rapidly.
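
The entries of Table 1 appear to follow simple closed forms in the number of items n. The Python sketch below states them as an observation on the tabulated values, not as formulas given in the text; the function names are hypothetical. The forms reproduce, for example, the rows for 5 and 10 items.

```python
def counts_all_cells(n):
    """(multiplications, summations) when summing over all 2**n cells.

    Assumed pattern: n + 1 multiplications per cell, and 2**n terms
    added up into n + 1 score totals.
    """
    cells = 2 ** n
    return (n + 1) * cells, cells - (n + 1)

def counts_mbv(n):
    """(multiplications, summations) for the MBV method; assumed pattern."""
    return n * (n + 3), n * (n - 1) // 2

for n in (5, 10, 20):
    print(n, counts_all_cells(n), counts_mbv(n))
```

Under these forms the cost of summing all cells grows exponentially in n, whereas the MBV counts grow only quadratically.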

To evaluate the full IPF algorithm, test data conforming 
to the Rasch model were generated for 20 items. The item
difficulties were randomly chosen from the uniform
distribution over the interval (-2,2]. Latent trait values
for 10,000 cases were drawn from a uniform distribution over
the [-3,3] interval. Log-linear Rasch models given in (2)
were then fitted to these data. Nine computer runs were made
for different subsets of items, where the first subset
contained the first four items, the second subset contained 
the first six items etc. In Figures 1, 2, and 3, different 
statistics of these runs are plotted against the number of 
items in the model. 



Insert Figures 1, 2 and 3 about here 
In Figure 1 the number of IPF iterations needed to
arrive at the maximum likelihood solution is plotted against
the number of items. Iterations were performed on a VAX 8750
computer until none of the parameter estimates could be
improved by more than .005. It is seen that the relationship
between the number of items and the number of iterations
needed for convergence is approximately linear.

As the number of items increases, the CPU time needed
for each of these iterations will also increase. In Figure 2,
the mean CPU time per iteration is plotted against the number
of items. It can be seen that the CPU time increases steeply
with the number of items but stays within reasonable limits
for moderate numbers of items. In Figure 3, the total CPU
time for IPF iterations and for initializing the algorithm is
plotted against the number of items. Initialization time
includes data input, computing marginal sums and creating
data structures for storage. According to Figure 3, the CPU
time for initialization increases almost linearly with the
number of items and the iteration time does not increase
dramatically with the number of items in the test.

In Table 2 the real and the estimated item difficulties
of all 20 items are given. The item
parameter estimates were obtained by the LOGIMO program and
by the PML (Gustafsson, 1977, 1980) program. The PML program
calculates the CML estimates of the item parameters with
Andersen's (1972) method. In both cases the first item
difficulty parameter was set equal to its real value.
Furthermore, the iterations were continued until none of the
parameter estimates could be improved by more than .0001. It
can be seen from Table 2 that both solutions are identical up
to the second decimal place, indicating that the IPF/MBV
algorithm correctly calculates maximum likelihood estimates.



Insert Table 2 here 



Finally, a note on the usefulness and availability of
LOGIMO. For ordinary log-linear models, provided they are not
too complicated, LOGIMO makes it possible to analyze larger
numbers of variables than with other programs. For certain
special Rasch models such as (2), dedicated programs such as
RIDA (1989) and PML will generally be faster. If, however,
the user wants to define his or her own IRT model with
several dimensions and/or user-specified category
coefficients, LOGIMO is the way to go. LOGIMO is a Pascal
program that runs on VAX systems under VMS. For smaller
problems there is a PC version (386, with extended memory).
LOGIMO will be distributed starting somewhere in the summer
of 1992 by iec ProGAMMA, P.O. Box 841, 9700 AV Groningen, The
Netherlands (E-mail: GAMMA@RUG.NL).
Discussion

In this paper an efficient algorithm is described that
calculates the parameter estimates of log-linear models
including log-linear IRT models. The algorithm avoids setting
up the full Item 1 x ... x Item k table by computing the
parameter estimates from the marginal sums of the table by a
modified version of the iterative proportional fitting
algorithm or the Newton-Raphson algorithm. The computation of
expected marginal sums is done efficiently using the MBV
method.

The modified IPF and MBV methods can be seen as
generalizations of older methods for the estimation of
unidimensional Rasch models. For this case, the modified IPF
algorithm turns out to be equivalent to an algorithm proposed
by Scheiblechner (1971; see Fischer, 1974, p. 247) and the MBV
method can be shown to be identical to the so-called
summation algorithm for the computation of elementary
symmetric functions (Andersen, 1972). To see the latter,
normalize the parameters in the Rasch model (2) as
$a_0 = b_0 = c_0 = d_0 = 1$. Elementary symmetric functions can
then be computed recursively using the following type of
relations

$$\gamma_m(a_1, b_1, c_1, d_1) = \gamma_m(a_1, b_1, c_1) + d_1\, \gamma_{m-1}(a_1, b_1, c_1),$$

and similar relations for $\gamma_m(a_1, b_1, c_1)$, $\gamma_m(a_1, b_1)$, etc.
It is easy to see that this summation is equivalent to the
left-most summation in (10), and $\gamma_m(a_1, b_1, c_1)$ and
$\gamma_{m-1}(a_1, b_1, c_1)$ are equivalent to the second summation in
(10). Thus, the MBV method for computing marginal sums in the
Rasch model is equivalent to the summation algorithm for
computing elementary symmetric functions. Despite this, for
unidimensional Rasch models LOGIMO is generally slower than
programs using the summation algorithm that are dedicated to
those models. As remarked before, its strength lies in
ordinary log-linear models and more complicated log-linear
IRT models.

LOGIMO is capable of dealing with models with
interaction terms and multiple weight-sum variables with
arbitrary weights defined by the user. In these models the
nice symmetries of the Rasch model are lost. It is an open
question whether improved methods for computing elementary
symmetric functions, such as those of Formann (1986) and
Verhelst, Glas and van der Sluis (1984), depend on these
symmetries and/or can be generalized for use with general
log-linear models.
Table 1

Number of Multiplications and Summations Required by Summing
over all Cells and the MBV Method to Calculate the Sumscore
Marginal

Number of      Summing all cells         MBV Method
Items                 x          +          x      +

    5               192         26         40     10
    6               448         57         54     15
    7              1024        120         70     21
    8              2304        247         88     28
    9              5120        502        108     36
   10             11264       1013        130     45
   11             24576       2036        154     55
   12             53248       4083        180     66
   13            114688       8178        208     78
   14            245760      16369        238     91
   15            524288      32752        270    105
   16           1114112      65519        304    120
   17           2359296     131054        340    136
   18           4980736     262125        378    153
   19          10485760     524268        418    171
   20          22020096    1048555        460    190
Table 2

Real and Estimated Item Difficulties for Simulated Data
(N = 10,000)

                          Item

              1        2        3        4        5

Real       0.858   -1.512   -0.173   -1.040    1.137
LOGIMO     0.858*  -1.517   -0.214   -1.069    1.161
PML        0.858*  -1.517   -0.215   -1.069    1.161

              6        7        8        9       10

Real       1.354    1.690    0.577   -1.270   -0.155
LOGIMO     1.318    1.636    0.618   -1.350   -0.154
PML        1.318    1.636    0.618   -1.349   -0.153

             11       12       13       14       15

Real       1.302    1.352   -0.823   -0.883   -1.754
LOGIMO     1.243    1.282   -0.858    0.871   -1.801
PML        1.244    1.284   -0.857    0.871   -1.801

             16       17       18       19       20

Real      -0.026    0.221    0.517   -0.460    1.658
LOGIMO    -0.038    0.183    0.502   -0.506    1.654
PML       -0.038    0.183    0.502   -0.507    1.653

*) The estimated parameter of the first item was set equal
to the real parameter value to fix the scale.
Figure Captions

Figure 1. Growth of the Number of IPF Iterations with the
Number of Items in Model 2.

Figure 2. Growth of CPU Time per Iteration with the Number of
Items in Model 2.

Figure 3. Growth of CPU Time for Initialization and IPF
Iterations with the Number of Items in Model 2.